AITopics | strategy space

Collaborating Authors

strategy space

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

55563844bcd4bba067fe86ac1f008c7e-Paper.pdf

Neural Information Processing SystemsApr-25-2026, 23:25:01 GMT

artificial intelligence, machine learning, zero-sum game, (17 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Genre: Research Report (0.68)

Industry: Leisure & Entertainment > Games (0.93)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.94)

Add feedback

On the Optimality of Dilated Entropy and Lower Bounds for Online Learning in Extensive-Form Games

Neural Information Processing SystemsMar-21-2026, 20:46:04 GMT

First-order methods (FOMs) are arguably the most scalable algorithms for equilibrium computation in large extensive-form games. To operationalize these methods, a distance-generating function, acting as a regularizer for the strategy space, must be chosen. The ratio between the strong convexity modulus and the diameter of the regularizer is a key parameter in the analysis of FOMs.A natural question is then: what is the optimal distance-generating function for extensive-form decision spaces? In this paper, we make a number of contributions, ultimately establishing that the weight-one dilated entropy (DilEnt) distance-generating function is optimal up to logarithmic factors. The DilEnt regularizer is notable due to its iterate-equivalence with Kernelized OMWU (KOMWU)---the algorithm with state-of-the-art dependence on the game tree size in extensive-form games---when used in conjunction with the online mirror descent (OMD) algorithm. However, the standard analysis for OMD is unable to establish such a result; the only current analysis is by appealing to the iterate equivalence to KOMWU. We close this gap by introducing a pair of primal-dual treeplex norms, which we contend form the natural analytic viewpoint for studying the strong convexity of DilEnt. Using these norm pairs, we recover the diameter-to-strong-convexity ratio that predicts the same performance as KOMWU. Along with a new regret lower bound for online learning in sequence-form strategy spaces, we show that this ratio is nearly optimal.Finally, we showcase our analytic techniques by refining the analysis of Clairvoyant OMD when paired with DilEnt, establishing an $\mathcal{O}(n \log |\mathcal{V}| \log T/T)$ approximation rate to coarse correlated equilibrium in $n$-player games, where $|\mathcal{V}|$ is the number of reduced normal-form strategies of the players, establishing the new state of the art.

artificial intelligence, distance-generating function, proceedings, (12 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)

Add feedback

Mechanism Design for Collaborative Normal Mean Estimation

Neural Information Processing SystemsFeb-16-2026, 02:31:39 GMT

However, simply pooling everyone's data and sharing with each other can lead to free-riding [

agent, artificial intelligence, machine learning, (17 more...)

Neural Information Processing Systems

Country:

Europe > Spain > Galicia > Madrid (0.04)
Europe > Spain > Andalusia > Granada Province > Granada (0.04)
Europe > France (0.04)
Asia > Middle East > Jordan (0.04)

Industry: Health & Medicine > Therapeutic Area (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.69)

Add feedback

Near-OptimalNo-RegretLearningDynamicsfor GeneralConvexGames

Neural Information Processing SystemsFeb-13-2026, 05:02:42 GMT

A recent line of work has established uncoupled learning dynamics such that, when employed by all players in a game, each player's regret after T repetitions grows polylogarithmically in T, an exponential improvement over the traditional guarantees within the no-regret framework. However, so far these results have only been limited to certain classes of games with structured strategy spaces--such as normal-form and extensive-form games. The question as to whether O(polylogT) regret bounds can be obtained for general convex and compact strategy sets--which occur in many fundamental models in economics and multiagent systems--while retaining efficient strategy updates is an importantquestion.

artificial intelligence, machine learning, normal-form game, (19 more...)

Neural Information Processing Systems

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.05)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.89)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.55)

Add feedback

55563844bcd4bba067fe86ac1f008c7e-Paper.pdf

Neural Information Processing SystemsFeb-8-2026, 17:55:44 GMT

equilibrium, zero-sum bilinear game, zero-sum game, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > Washington > King County > Seattle (0.04)
North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Singapore > Central Region > Singapore (0.04)

Genre: Research Report (0.68)

Industry: Leisure & Entertainment > Games (0.93)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Block-Coordinate Methods and Restarting for Solving Extensive-Form Games

Neural Information Processing SystemsDec-24-2025, 05:17:03 GMT

Coordinate descent methods are popular in machine learning and optimization for their simple sparse updates and excellent practical performance. In the context of large-scale sequential game solving, these same properties would be attractive, but until now no such methods were known, because the strategy spaces do not satisfy the typical separable block structure exploited by such methods.We present the first cyclic coordinate-descent-like method for the polytope of sequence-form strategies, which form the strategy spaces for the players in an extensive-form game (EFG). Our method exploits the recursive structure of the proximal update induced by what are known as dilated regularizers, in order to allow for a pseudo block-wise update.We show that our method enjoys a O(1/T) convergence rate to a two-player zero-sum Nash equilibrium, while avoiding the worst-case polynomial scaling with the number of blocks common to cyclic methods. We empirically show that our algorithm usually performs better than other state-of-the-art first-order methods (i.e., mirror prox), and occasionally can even beat CFR$^+$, a state-of-the-art algorithm for numerical equilibrium computation in zero-sum EFGs. We then introduce a restarting heuristic for EFG solving. We show empirically that restarting can lead to speedups, sometimes huge, both for our cyclic method, as well as for existing methods such as mirror prox and predictive CFR$^+$.

block-coordinate method and restarting, extensive-form game, name change, (5 more...)

Neural Information Processing Systems

Technology:

Information Technology > Game Theory (0.59)
Information Technology > Artificial Intelligence > Machine Learning (0.39)

Add feedback

Adaptively Coordinating with Novel Partners via Learned Latent Strategies

Li, Benjamin, Shi, Shuyang, Romero, Lucia, Li, Huao, Xie, Yaqi, Kim, Woojun, Nikolaidis, Stefanos, Lewis, Michael, Sycara, Katia, Stepputtis, Simon

arXiv.org Artificial IntelligenceNov-18-2025

Adaptation is the cornerstone of effective collaboration among heterogeneous team members. In human-agent teams, artificial agents need to adapt to their human partners in real time, as individuals often have unique preferences and policies that may change dynamically throughout interactions. This becomes particularly challenging in tasks with time pressure and complex strategic spaces, where identifying partner behaviors and selecting suitable responses is difficult. In this work, we introduce a strategy-conditioned cooperator framework that learns to represent, categorize, and adapt to a broad range of potential partner strategies in real-time. Our approach encodes strategies with a variational autoencoder to learn a latent strategy space from agent trajectory data, identifies distinct strategy types through clustering, and trains a cooperator agent conditioned on these clusters by generating partners of each strategy type. For online adaptation to novel partners, we leverage a fixed-share regret minimization algorithm that dynamically infers and adjusts the partner's strategy estimation during interaction. We evaluate our method in a modified version of the Overcooked domain, a complex collaborative cooking environment that requires effective coordination among two players with a diverse potential strategy space. Through these experiments and an online user study, we demonstrate that our proposed agent achieves state of the art performance compared to existing baselines when paired with novel human, and agent teammates.

artificial intelligence, machine learning, natural language, (15 more...)

arXiv.org Artificial Intelligence

2511.12754

Country: North America > United States (0.68)

Genre:

Research Report (1.00)
Questionnaire & Opinion Survey (1.00)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Add feedback

Mechanism Design for Collaborative Normal Mean Estimation

Neural Information Processing SystemsOct-9-2025, 02:29:15 GMT

However, simply pooling everyone's data and sharing with each other can lead to free-riding [

agent, artificial intelligence, machine learning, (17 more...)

Neural Information Processing Systems

Country:

Europe > Spain > Galicia > Madrid (0.04)
Europe > Spain > Andalusia > Granada Province > Granada (0.04)
Europe > France (0.04)
Asia > Middle East > Jordan (0.04)

Industry: Health & Medicine > Therapeutic Area (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.69)

Add feedback

Learning to Play Multi-Follower Bayesian Stackelberg Games

Personnat, Gerson, Lin, Tao, Hossain, Safwan, Parkes, David C.

arXiv.org Artificial IntelligenceOct-3-2025

In a multi-follower Bayesian Stackelberg game, a leader plays a mixed strategy over $L$ actions to which $n\ge 1$ followers, each having one of $K$ possible private types, best respond. The leader's optimal strategy depends on the distribution of the followers' private types. We study an online learning version of this problem: a leader interacts for $T$ rounds with $n$ followers with types sampled from an unknown distribution every round. The leader's goal is to minimize regret, defined as the difference between the cumulative utility of the optimal strategy and that of the actually chosen strategies. We design learning algorithms for the leader under different feedback settings. Under type feedback, where the leader observes the followers' types after each round, we design algorithms that achieve $\mathcal O\big(\sqrt{\min\{L\log(nKA T), nK \} \cdot T} \big)$ regret for independent type distributions and $\mathcal O\big(\sqrt{\min\{L\log(nKA T), K^n \} \cdot T} \big)$ regret for general type distributions. Interestingly, those bounds do not grow with $n$ at a polynomial rate. Under action feedback, where the leader only observes the followers' actions, we design algorithms with $\mathcal O( \min\{\sqrt{ n^L K^L A^{2L} L T \log T}, K^n\sqrt{ T } \log T \} )$ regret. We also provide a lower bound of $Ω(\sqrt{\min\{L, nK\}T})$, almost matching the type-feedback upper bounds.

artificial intelligence, data mining, machine learning, (21 more...)

arXiv.org Artificial Intelligence

2510.01387

Country: North America > United States (1.00)

Genre: Research Report (0.64)

Industry: